general position assumption
5227fa9a19dce7ba113f50a405dcaf09-AuthorFeedback.pdf
We thank the reviewers for their careful reading and helpful comments. Reviewers also noted the scalability concerns; on very large networks, the numerical precision of MIP solvers may become an issue as well. R#1: "The only thing I feel is missing is a discussion on how to verify if a network is in general position" — we will discuss this in the paper. R#2 raised experimental questions/suggestions and asked "...are there any other properties of networks that could be in-..." — we will present a more complete version of Table 1 in future iterations.
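On R#1's question of verifying general position: for a one-hidden-layer ReLU network this reduces to checking that the neuron hyperplanes are in general position, which can be tested with rank computations. The sketch below is a hypothetical helper, not a procedure from the paper; it is exponential in the number of neurons and only practical for tiny networks.

```python
from itertools import combinations

import numpy as np

def in_general_position(W, b, tol=1e-9):
    """Check whether the neuron hyperplanes {x : W[i]·x + b[i] = 0} of a
    one-hidden-layer ReLU net are in general position: every k <= d of
    them meet in a (d-k)-dimensional flat, and every k > d of them have
    empty intersection.  Hypothetical helper, exponential in n."""
    n, d = W.shape
    for k in range(1, n + 1):
        for S in combinations(range(n), k):
            Ws, bs = W[list(S)], b[list(S)]
            rW = np.linalg.matrix_rank(Ws, tol)
            rA = np.linalg.matrix_rank(np.hstack([Ws, bs[:, None]]), tol)
            if k <= d:
                # need a nonempty intersection of dimension exactly d - k
                if rW != k or rA != k:
                    return False
            else:
                # more than d hyperplanes must not share a common point
                if rA == rW:
                    return False
    return True
```

Random weight matrices pass this check with probability one, which is why "general position" is a mild assumption; the check fails on degeneracies such as parallel or duplicated neurons.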
Review for NeurIPS paper: Exactly Computing the Local Lipschitz Constant of ReLU Networks
Weaknesses: 1. Except for Section 3, the other theoretical findings/results seem rather standard. For example, the reformulation techniques in Section 5 have been widely used in mixed-integer programming and even for certifying adversarial robustness (e.g., SMT solvers); nothing new. Also, Theorem 1 can be easily obtained via several simple inductions. Is the ReLU network subdifferentially regular under the general position assumption? See Theorem 49 in [1] for details.
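For context on what the paper computes exactly: on each linear region of f(x) = w2 · relu(W1 x + b1), the gradient is W1^T (w2 ⊙ a) for that region's activation pattern a, and the local Lipschitz constant is the largest gradient norm over regions meeting the input set. The sketch below is a hypothetical sampling-based lower bound on that quantity, not the paper's exact MIP formulation; it is exact only if every linear region inside the box is hit by a sample.

```python
import numpy as np

def local_lipschitz_lower_bound(W1, b1, w2, lo, hi, n=20000, seed=0):
    """Sampling-based lower bound on the local Lipschitz constant of
    f(x) = w2 · relu(W1 x + b1) over the box [lo, hi]^d: sample points,
    read off each point's activation pattern, form the corresponding
    region gradient W1^T (w2 ⊙ a), and return the largest l2 norm.
    Hypothetical sketch, not the paper's exact MIP-based method."""
    rng = np.random.default_rng(seed)
    d = W1.shape[1]
    X = rng.uniform(lo, hi, size=(n, d))
    A = (X @ W1.T + b1 > 0).astype(float)  # activation pattern per sample
    G = (A * w2) @ W1                      # per-sample gradient, shape (n, d)
    return np.linalg.norm(G, axis=1).max()
```

For instance, with W1 = [[1], [-1]], b1 = 0, w2 = [1, 2], the network is f(x) = relu(x) + 2·relu(-x), whose gradients on [-1, 1] are 1 and -2, so the bound returns 2.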
An Exact Poly-Time Membership-Queries Algorithm for Extraction a Three-Layer ReLU Network
We consider the natural problem of learning a ReLU network from queries, which was recently remotivated by model extraction attacks. In this work, we present a polynomial-time algorithm that can learn a depth-two ReLU network from queries under mild general position assumptions. We also present a polynomial-time algorithm that, under mild general position assumptions, can learn a rich class of depth-three ReLU networks from queries. For instance, it can learn most networks where the number of first-layer neurons is smaller than the dimension and the number of second-layer neurons. These two results substantially improve on the state of the art: until our work, polynomial-time algorithms were only known to learn depth-two networks from queries under the assumption that either the underlying distribution is Gaussian (Chen et al. (2021)) or that the rows of the weight matrix are linearly independent (Milli et al. (2019)). For depth three or more, there were no known poly-time results. With the growth of neural-network-based applications, many commercial companies offer machine learning services, allowing public use of trained networks as black boxes. These services let users query the model and, in some cases, return the exact output of the network, allowing users to reason about the model's output.
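A core primitive in query-based extraction of piecewise-linear networks is locating, with black-box value queries only, a point on a segment where the network changes its linear piece. The sketch below shows one hypothetical way to do this for a scalar restriction with a single slope change; it is only the query primitive, not the paper's full extraction algorithm, and `find_kink` and its assumptions are ours for illustration.

```python
def find_kink(f, a, b, tol=1e-9, iters=50):
    """Bisection for the single slope change ("critical point") of a
    piecewise-linear scalar function on [a, b], using only black-box
    value queries: on a truly linear interval, the midpoint value equals
    the average of the endpoint values.  Assumes exactly one kink lies
    in (a, b); hypothetical sketch of the query primitive."""
    fa, fb = f(a), f(b)
    for _ in range(iters):
        m = 0.5 * (a + b)
        fm = f(m)
        ml = 0.5 * (a + m)
        # is the left half [a, m] linear?
        if abs(f(ml) - 0.5 * (fa + fm)) > tol:
            b, fb = m, fm   # no: the kink lies in the left half
        else:
            a, fa = m, fm   # yes: the kink lies in the right half
    return 0.5 * (a + b)
```

Along the segment, such a kink corresponds to some hidden neuron switching its activation; recovering enough kinks along well-chosen lines is what lets extraction algorithms reconstruct the neuron hyperplanes.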